Generative Model Configuration
In LLM, we can experiment with generative configuration parameters to influence the way that the model makes the final decision about next-word generation
generative configuration parameters for inference
-
max new tokens
-
greedy vs. random sampling
-
top-k sampling vs. top-p samplingfin
-
temperature
- affect the randomness of the output of the softmax layer in Transformer
- temperature =0 -> reliability
- temperature >0 -> variety